7 research outputs found

    Towards the first machine translation system for Sumerian transliterations

    The Sumerian cuneiform script was invented more than 5,000 years ago and represents one of the oldest in history. We present the first attempt to translate Sumerian texts into English automatically. We publicly release high-quality corpora for standardized training and evaluation and report results on experiments with supervised, phrase-based, and transfer learning techniques for machine translation. Quantitative and qualitative evaluations indicate the usefulness of the translations. Our proposed methodology provides a broader audience of researchers with novel access to the data, accelerates the costly and time-consuming manual translation process, and helps them better explore the relationships between Sumerian cuneiform and Mesopotamian culture

    Towards a linked open data edition of Sumerian corpora

    Linguistic Linked Open Data (LLOD) is a flourishing line of research in the language resource community, so far mostly adopted for selected aspects of linguistics, natural language processing and the semantic web, as well as for practical applications in localization and lexicography. Yet, computational philology seems to be somewhat decoupled from the recent progress in this area: even though LOD as a concept is gaining significant popularity in Digital Humanities, existing LLOD standards and vocabularies are not widely used in this community, and philological resources are underrepresented in the LLOD cloud diagram (http://linguistic-lod.org/llod-cloud). In this paper, we present an application of Linguistic Linked Open Data in Assyriology. We describe the LLOD edition of a linguistically annotated corpus of Sumerian, as well as its linking with lexical resources, repositories of annotation terminology, and the museum collections in which the artifacts bearing these texts are kept. The chosen corpus is the Electronic Text Corpus of Sumerian Royal Inscriptions, a well curated and linguistically annotated archive of Sumerian text, in preparation for the creating and linking of other corpora of cuneiform texts, such as the corpus of Ur III administrative and legal Sumerian texts, as part of the Machine Translation and Automated Analysis of Cuneiform Languages project (https://cdli-gh.github.io/mtaac/)

    Machine Translation and Automated Analysis of Cuneiform Languages (MTAAC)

    Project Abstract: Ancient Mesopotamia, birthplace of writing, has produced vast numbers of cuneiform tablets that only a handful of highly specialized scholars are able to read. The task of studying them is so labor intensive that the vast majority have not yet been translated, with the result that their contents are not accessible either to historians in other fields or to the wider public. This project will develop and apply new computerised methods to translate and analyse the contents of some 67,000 highly standardised administrative documents from southern Mesopotamia from the 21st century BC. By automating these basic but labor-intensive processes, we will free up scholars’ time. The tools that we will develop, combining machine learning, statistical and neural machine translation technologies, may then be applied to other ancient languages. Similarly, the translations themselves, and the historical, social and economic data extracted from them, will be made publicly available on the web

    When linguistics meets web technologies. Recent advances in modelling linguistic linked data

    This article provides an up-to-date and comprehensive survey of models (including vocabularies, taxonomies and ontologies) used for representing linguistic linked data (LLD). It focuses on the latest developments in the area and both builds upon and complements previous works covering similar territory. The article begins with an overview of recent trends which have had an impact on linked data models and vocabularies, such as the growing influence of the FAIR guidelines, the funding of several major projects in which LLD is a key component, and the increasing importance of the relationship of the digital humanities with LLD. Next, we give an overview of some of the most well known vocabularies and models in LLD. After this we look at some of the latest developments in community standards and initiatives such as OntoLex-Lemon, as well as recent work which has been carried out in corpora and annotation and LLD, including a discussion of the LLD metadata vocabularies META-SHARE and lime and language identifiers. In the following part of the paper we look at work which has been realised in a number of recent projects and which has a significant impact on LLD vocabularies and models

    Abstract: Cuneiform Digital Library Initiative White Paper for the Global Philology Project

    Talk Abstract

    Comparison of the Two Periods of Ismet Inonu Era in Terms of Religious Freedom (1938-1945 / 1945-1950)

    The term freedom has been a hot topic throughout known history since ancient Greece and has been debated in every society. For that reason, it has been given many different definitions and has come to be in demand. Throughout history, different milestones were set and different definitions were attached to the term. In the sixteenth century, sovereign states misused their authority to violate human freedom, and the term gained some different aspects, especially with the deprivation of ownership after that time

    Gerardo Castillo (2001). Anatomía de una historia de amor. Amor soñado y amor vivido. Pamplona: Eunsa [RECENSIÓN]

    This paper describes work on the morphological and syntactic annotation of Sumerian cuneiform as a model for low resource languages in general. Cuneiform texts are invaluable sources for the study of history, languages, economy, and cultures of Ancient Mesopotamia and its surrounding regions. Assyriology, the discipline dedicated to their study, has vast research potential, but lacks the modern means for computational processing and analysis. Our project, Machine Translation and Automated Analysis of Cuneiform Languages, aims to fill this gap by bringing together corpus data, lexical data, linguistic annotations and object metadata. The project’s main goal is to build a pipeline for machine translation and annotation of Sumerian Ur III administrative texts. The rich and structured data is then to be made accessible in the form of (Linguistic) Linked Open Data (LLOD), which should open them to a larger research community. Our contribution is two-fold: in terms of language technology, our work represents the first attempt to develop an integrative infrastructure for the annotation of morphology and syntax on the basis of RDF technologies and LLOD resources. With respect to Assyriology, we work towards producing the first syntactically annotated corpus of Sumerian